Stanford NLP Seven class named entity recognition classifier not giving desired results in python

by: sahni.lakshay11, 8 years ago


I am trying to use Stanford's Named Entity Recognizer. I want the 7-class classifier because I also want to detect times (or dates) and other entity types in a sentence. When I enter the sentence "He was born on October 15, 1931 at Dhanushkothi in the temple town Rameshwaram in Tamil Nadu." in the online demo on the Stanford NLP site (http://nlp.stanford.edu:8080/ner/process), it is classified correctly, as can be seen in this image: https://i.stack.imgur.com/EE4g5.png

But when I run the code on my system using NLTK and StanfordNERTagger, I get the wrong result. The output is:

[(u'He', u'O'), (u'was', u'O'), (u'born', u'O'), (u'on', u'O'), (u'1931-10-15', u'O'), (u'at', u'O'), (u'Dhanushkothi', u'O'), (u'in', u'O'), (u'the', u'O'), (u'temple', u'O'), (u'town', u'O'), (u'Rameshwaram', u'O'), (u'in', u'O'), (u'Tamil', u'ORGANIZATION'), (u'Nadu', u'ORGANIZATION'), (u'.', u'O')]

It incorrectly tags the date as 'O' (other) and tags Tamil Nadu as an organization instead of a location. The code I used is below:


    from nltk.tokenize import word_tokenize
    from nltk.tag import StanfordNERTagger

    # Load the 7-class model; both paths are relative to the working directory
    st = StanfordNERTagger('english.muc.7class.distsim.crf.ser.gz', 'stanford-ner.jar')
    i = "He was born on October 15, 1931 at Dhanushkothi in the temple town Rameshwaram in Tamil Nadu."
    # word_tokenize is imported directly, so it is called without the nltk. prefix
    words = word_tokenize(i)
    namedEnt = st.tag(words)
    print(namedEnt)
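When checking tagger output like the list above, it can help to merge adjacent tokens that share a tag into whole entities, so a mis-tagged span such as Tamil Nadu stands out. The following is only a minimal sketch: `group_entities` is a hypothetical helper name, not part of NLTK or Stanford NER, and it assumes the input is a list of (token, tag) pairs with 'O' meaning "no entity". Note that two distinct neighbouring entities with the same tag would be merged into one.

```python
from itertools import groupby


def group_entities(tagged, outside="O"):
    """Merge runs of adjacent tokens sharing a non-outside tag into one entity."""
    entities = []
    # groupby yields one group per run of consecutive pairs with the same tag
    for tag, run in groupby(tagged, key=lambda pair: pair[1]):
        if tag != outside:
            entities.append((" ".join(token for token, _ in run), tag))
    return entities


# The tagged output reported in the question (u'' prefixes dropped)
tagged = [
    ("He", "O"), ("was", "O"), ("born", "O"), ("on", "O"),
    ("1931-10-15", "O"), ("at", "O"), ("Dhanushkothi", "O"), ("in", "O"),
    ("the", "O"), ("temple", "O"), ("town", "O"), ("Rameshwaram", "O"),
    ("in", "O"), ("Tamil", "ORGANIZATION"), ("Nadu", "ORGANIZATION"),
    (".", "O"),
]

print(group_entities(tagged))
```

Running the helper on the output of `st.tag(words)` gives one (text, tag) pair per detected entity, which makes it easier to compare the local result with the online demo.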


Can anyone please tell me what mistake I'm making (if any), or suggest another way to identify locations and times in a sentence? I'm a beginner in NLP and any help would be appreciated. Thanks.







Add the encoding='utf-8' argument to st.

As for Tamil Nadu, you may need to train the NER model on it. See their reference for how to do that.
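The encoding suggestion could be sketched roughly as below. This assumes the model and jar files from the question sit in the working directory and that Java is installed. `build_tagger` and `tag_sentence` are hypothetical wrapper names used only for illustration. Some NLTK releases may already default to UTF-8, so whether this changes the tags for the example sentence is not confirmed here.

```python
def build_tagger(model="english.muc.7class.distsim.crf.ser.gz",
                 jar="stanford-ner.jar"):
    """Create the 7-class tagger with an explicit encoding (paths are assumptions)."""
    # Imported lazily so this file still loads where NLTK is not installed
    from nltk.tag import StanfordNERTagger
    return StanfordNERTagger(model, jar, encoding="utf-8")


def tag_sentence(tagger, sentence):
    """Tokenize a sentence and return (token, tag) pairs from the tagger."""
    from nltk.tokenize import word_tokenize
    return tagger.tag(word_tokenize(sentence))
```

Calling `tag_sentence(build_tagger(), "He was born on October 15, 1931 ...")` would then replace the `st = ...` and `st.tag(words)` lines from the question.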

-ryanrg 7 years ago
Last edited 7 years ago
